Abstract

One aspect of fault-tolerance in process control programs is the 
ability to tolerate sensor failure. This paper presents a methodology 
for transforming a process control program that cannot tolerate sensor 
failures to one that can. Additionally, a hierarchy of failure models is 
identified. 

Keywords: fault-tolerance, process control systems, real-time 
distributed systems. 


1 Introduction 

A process control program communicates and synchronizes with a physical 
process. Typically, the program reads values from the physical process 
through sensors and writes values through actuators, as shown schemati- 
cally in figure 1. 

* Submitted to the 10th Real-Time Systems Symposium, Los Angeles, December 1989.
† This work was supported by the Defense Advanced Research Projects Agency (DoD)
under ARPA order 6037, Contract N00140-87-C-8904. The views, opinions, and findings
contained in this report are those of the authors and should not be construed as an official
Department of Defense position, policy, or decision.

Figure 1: A real-time program

This paper concerns sensor failures. We assume that the programmer writes
the program of figure 1 under the assumption that sensors return accurate
values, and we provide a methodology for transforming this program into one
that tolerates sensors that return inaccurate values.

There are two ways to tolerate sensor failures: 

1. Based on the sensor’s specification, the control program can deter- 
mine that a sensor has failed from the value provided by the sensor. 
For example, if the control program used a sensor to measure the 
temperature of a reaction vessel and the thermometer read a value 
too high to be realistic, then the control program would know the 
thermometer has failed. 

2. The sensor can be replicated, either physically or logically, as shown
in figure 2. The values of the replicated sensors are compared and
a correct sensor value calculated. For example, instead of one
thermometer, there could be two thermometers and one pressure gauge
on the vessel. From the ideal gas law PV = nRT, we can calculate three
independent temperature readings (or pressure readings, if desired).
As long as no more than one sensor fails, the control program can
continue to execute correctly [Sch86a].

These two approaches can be used either independently or together.
However, the second approach is better suited for process control programs,
because the program always gets an answer in a predictable amount of
time. The first approach can only be used if no deadline will be missed due
to the loss of sensor information.

Figure 2: Replicated sensors

The approach we develop is as follows: 

1. Write the process control program with reference to the actual state 
variables of the physical system. For example, the program control- 
ling the reaction vessel would refer to the temperature T. 

2. For each physical variable referenced by the control program through
a sensor, replace it with a reference to an abstract sensor. An abstract
sensor is an interval that contains the physical variable. This step
cannot be done automatically; the specification of the process control
program will have to be changed to refer to abstract sensors.

3. Implement a set of abstract sensors based on a set of physical sensors. 
A physical sensor is a device that “reads” a physical state variable. 
This step cannot be done automatically, since it may take some knowl- 
edge of the physical process to implement abstract sensors. 

4. Apply a fault-tolerant averaging algorithm to these abstract sensor
values in order to derive an abstract sensor that is correct.
The algorithm assumes that out of $n$ sensors no more than some
parameter $f$ of them are incorrect. The relation between $n$ and $f$
(beyond $n > f$) depends on the way sensors can fail. We will
assume that abstract sensors fail independently of each other.

This paper is a generalization of the work done by the author and
presented in [MO83, Mar84]. This earlier work looked at the problem of clock
synchronization in a distributed system. A clock is a special kind of sensor,
in that the physical process it senses can be expressed simply. In this paper,
algorithm 1 and theorem 3 are taken directly from [Mar84]. Section 2 is
new, as are the different failure models discussed in section 3.

The methodology presented in this paper can be thought of as a
generalization of the state machine approach [Sch86b, Lam84]. A related problem,
inexact agreement, is presented in [MS85], but the goals are different. Our
goal is to dynamically calculate bounds on the accuracy of a sensor value,
not to have multiple processors agree on a sensor value. A different
approach to agreement among sensors is taken in [Mac84], in which sensor
failure is not considered.

In section 2, we define a method of representing sensors that makes them 
amenable to replication. In section 3, we discuss sensor failure models and 
present a sensor averaging algorithm. Section 4 contains a demonstration 
of our methodology. 


2 Physical Variables and Sensors 

A variable in a computer process is quite different from a state variable in a 
physical process. A computer variable takes on values from a finite domain, 
and can assume only a bounded number of values in any finite time period. 
A physical state variable, however, may take on any real value and can 
have an unbounded number of values within any non-zero time period. A 
convenient way to represent physical state variables in computer programs
is as functions. The domain of a physical variable is typically time, but it
can be some other physical variable, depending on the safety properties of
interest (footnote 1).

There are three distinct values associated with a physical variable: the
physical variable itself; the value returned by a sensor, called here a physical
sensor; and the value of a physical variable derived from the value of a sensor,
called here an abstract sensor. In this section, the relation among these values
is defined more precisely.

2.1 Physical and Abstract Sensors 

A physical sensor is a device used by a computer to sample a physical vari- 
able. For example, a computer controlling a reaction vessel might have 
a thermometer as a physical sensor. The computer may obtain values 
from the sensor either by polling it for its current value or by being
asynchronously alerted when a certain value is attained. It is convenient to
think of a physical sensor as a mechanism that returns pairs of values, such
as temperature and time, or position and velocity. By doing so, one can
talk of the physical variable as being a function, such as $T(t)$ or $v(x)$, and
the physical sensor as producing samples $(T, t : T = T(t))$ or $(v, x : v = v(x))$.
Either the sensor device or the controlling computer process may make this
association.

A physical sensor is not a very convenient mechanism. For example, with 
the thermometer attached to the vessel: 

• The sensor has a limited accuracy, giving an uncertainty to the tem- 
perature. This uncertainty is increased by delays incurred by sched- 
ulers and networks. 

• The control program may be interested in a temperature at a time 
the thermometer was not sampled. A value may be interpolated, but 
to do so requires knowledge of the physical process. 

• The sensor may have interesting perceptive properties; for example,
it might generate an interrupt if the temperature rises above 100
degrees. This is an important property of the sensor: it allows for an
accurate determination of when 100 degrees is reached. There may be
other ways to make the same kind of precise measurement, however.
It would be convenient if the control program could be the same for
any method of measurement, as long as the measurement is accurate
enough.

1 In the application of section 4, for example, the velocity of a train can be expressed
as a function of location.

We define an abstract sensor as a piecewise continuous function from a
physical variable to a dense interval of physical values (footnote 2). We will
indicate an abstract sensor as a function such as $\bar{T}(t)$. When possible, we
will simply write $\bar{T}$ if we are interested in the “current” value; that is, the
sensor value for the current value of $t$. Control programs will compute with
abstract sensors rather than physical sensors. An abstract sensor $\bar{T}$ is correct
if it always bounds the actual physical value. More precisely,

$$ \bar{T} \text{ correct over } D \;\stackrel{\mathrm{def}}{=}\; \forall d \in D : \min \bar{T}(d) \le T(d) \le \max \bar{T}(d) $$

Given a physical sensor, it may not be easy to implement an abstract
sensor. In general, it may require considerable knowledge about the physical
process being monitored. For example, suppose the manufacturer of the
thermometer claims it returns a value $T$ with an accuracy of $\epsilon$ degrees, and
the computer can read the sensor’s value within $\delta$ seconds of the
thermometer being sampled. If the time the computer program receives $T$ is $t$,
all we know is that
$\exists t_0 : t - \delta \le t_0 \le t : T - \epsilon/2 \le T(t_0) \le T + \epsilon/2$.
This, alone, is not sufficient information to define an abstract sensor $\bar{T}(t)$,
since we don’t know how to interpolate values between successive sensor readings.

Suppose, however, we know from the physical process being monitored
that $|dT/dt| \le \Lambda$. This bound on the rate of change of $T$ allows us to
interpolate intermediate values with a known accuracy. The abstract sensor
$\bar{T}(t')$ can be defined as

$$ T - \epsilon/2 - \Lambda (t' - t + \delta) \;\le\; \bar{T}(t') \;\le\; T + \epsilon/2 + \Lambda (t' - t + \delta) \quad \text{for } t' \ge t $$

2 An abstract sensor $\bar{T}(d)$ can be represented as a pair of functions $T_{min}(d)$ and $T_{max}(d)$,
where $\bar{T}(d)$ is the interval $[T_{min}(d) .. T_{max}(d)]$. With this representation,
$\min \bar{T}(d) = T_{min}(d)$, $\max \bar{T}(d) = T_{max}(d)$, and $|\bar{T}(d)| = T_{max}(d) - T_{min}(d)$.

Unlike physical sensors, abstract sensors can be compared with each other. 
In section 3, this property will be used to construct a fault-tolerant abstract 
sensor. 
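
The thermometer example can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not an implementation from the paper: the names Interval and abstract_sensor are hypothetical, eps is the accuracy claimed by the manufacturer, delta is the bound on the read delay, and rate is an assumed bound on how fast the temperature can change.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A closed interval [lo .. hi] of physical values."""
    lo: float
    hi: float

    def width(self) -> float:
        return self.hi - self.lo

    def contains(self, x: float) -> bool:
        return self.lo <= x <= self.hi


def abstract_sensor(reading: float, t_recv: float, t_query: float,
                    eps: float, delta: float, rate: float) -> Interval:
    """Interval that bounds the physical value at time t_query.

    reading was received at t_recv, so it was sampled at some time in
    [t_recv - delta, t_recv] and is accurate to within eps / 2.
    rate is an assumed bound on the rate of change of the variable.
    """
    if t_query < t_recv:
        raise ValueError("t_query must not precede t_recv")
    # Worst case: the sample is delta old at t_recv and the variable has
    # drifted at the maximum rate ever since it was sampled.
    slack = eps / 2 + rate * (t_query - t_recv + delta)
    return Interval(reading - slack, reading + slack)
```

The interval widens linearly with the time since the last reading, so a control program that needs a tighter bound must poll the physical sensor more often.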


2.2 Abstract Sensors in Specifications 

A specification of a process control program may have to be changed when
expressed using abstract sensors. It is not possible simply to take a control
program written in terms of physical variables and, for each reference to a
physical variable, substitute a reference to the corresponding abstract sensor.
For example, consider a reaction vessel with a pressure relief valve. One safety
condition might be that whenever the pressure $p$ is greater than some ceiling
$p_{max}$ the valve is open ($R$); or, $p > p_{max} \Rightarrow R$. The condition
$\bar{p} > p_{max} \Rightarrow R$ does not make sense; what does it mean for an
interval ($\bar{p}$) to be greater than a value?

Let $S$ be a condition on the system and $V$ be the set of physical variables
in $S$ that will be accessed through abstract sensors. We need another
condition $S'$ that contains no references to any $v_i \in V$ but may instead
contain references to $\bar{v}_i$. The only a priori constraint we have on $S'$ is that
it reduces to $S$ when the abstract sensors have perfect accuracy:

$$ (S' \land (\forall i : |\bar{v}_i| = 0)) \Rightarrow S^{v_i}_{\bar{v}_i} $$

where $S^{v_i}_{\bar{v}_i}$ denotes $S$ with each reference to $v_i$ replaced by $\bar{v}_i$.

There are several ways such an $S'$ can be constructed; for example, we
could replace all references to $v_i$ in $S$ with references to the midpoint of $\bar{v}_i$.
However, if we assume that all values in $\bar{v}_i$ have the same likelihood of being
valid, we have two possibilities. We can either require that all points in $\bar{v}_i$
satisfy $S$ or that there exists at least one point in $\bar{v}_i$ that satisfies $S$. More
precisely, for each physical variable $v_i$, the condition $S$ can be generalized
as:

$$ S' \stackrel{\mathrm{def}}{=} \forall v_i \in \bar{v}_i : S \quad \text{or} \quad S' \stackrel{\mathrm{def}}{=} \exists v_i \in \bar{v}_i : S $$

The generalization of $S$ cannot be done automatically, since it is really a
refinement of the problem specification. One approach is to consider the
states that are excluded by $S$, and then expand this set if possible. In the
example above, we are probably most interested in avoiding an explosion
of the vessel. If so, the condition we want is
$(\exists p \in \bar{p} : p > p_{max}) \Rightarrow R$, and
we would accept the risk that the pressure valve may open prematurely.
As another example, we might want to assert that a catalyst is injected
($C$) only when the pressure is above a minimum value, or $C \Rightarrow (p > p_{min})$.
In this case, the state we are trying to avoid is one where the catalyst is
injected at too low a pressure, so we would generalize this condition to
$C \Rightarrow (\forall p \in \bar{p} : p > p_{min})$. Note that in this case we admit no
states that violate the original condition.
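
The two generalizations are easy to evaluate when an abstract sensor is represented as a closed interval. The sketch below is illustrative only; the function names and the (lo, hi) tuple representation are assumptions made here, not notation from the paper. For a closed interval, the existential form only needs the upper endpoint and the universal form only needs the lower endpoint.

```python
from typing import Tuple

# An abstract sensor value as a closed interval (lo, hi); assumed representation.
Interval = Tuple[float, float]


def exists_above(p: Interval, threshold: float) -> bool:
    """(exists x in p : x > threshold): some point of the interval exceeds it."""
    return p[1] > threshold


def all_above(p: Interval, threshold: float) -> bool:
    """(forall x in p : x > threshold): every point of the interval exceeds it."""
    return p[0] > threshold


def valve_must_open(p: Interval, p_max: float) -> bool:
    # Conservative safety rule: open the relief valve if the pressure
    # might exceed the ceiling, accepting premature openings.
    return exists_above(p, p_max)


def catalyst_allowed(p: Interval, p_min: float) -> bool:
    # Inject the catalyst only if the pressure is certainly above the minimum.
    return all_above(p, p_min)
```

With a pressure sensor reading (9.0, 11.0) and both thresholds at 10.0, the valve must open but the catalyst may not be injected: the same uncertain reading is resolved in opposite directions, each time away from the state the original condition excludes.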


3 Fault-Tolerant Abstract Sensors

Given $n$ independent abstract sensors and a failure model, we would like
to construct an abstract sensor that is provably non-faulty. We will first
consider the simple failure model of arbitrary failures (footnote 3). We will
assume a sensor is either faulty, in which case it can return any value, or the
sensor is correct and always returns a correct value. We assume that no more than
$f$ of the $n$ sensors can be faulty.


3.1 Arbitrary Failures 

Let $\bar{T}_i$ and $\bar{T}_j$ ($i \ne j$) be two abstract sensors for the same physical value
$T$. If $\bar{T}_i$ and $\bar{T}_j$ are both correct, then by definition $T$ must be in both
intervals. Put another way, the intervals $\bar{T}_i$ and $\bar{T}_j$ must intersect, and
their intersection must contain $T$.

3 This failure model has also been called Byzantine failures or malicious failures.

If we have no more than $f$ faulty sensors, any set of $n - f$ or more mutually
intersecting sensors may be correct, since they all share a common value.
Conversely, any point not contained in at least $n - f$ intervals cannot be
the correct value. Let $l$ be the smallest value contained in at least $n - f$
intervals and $h$ be the largest value contained in at least $n - f$ intervals.
The correct value $T$ is then bounded by $l$ and $h$. This gives us our sensor
averaging algorithm.

Algorithm 1 Fault-tolerant Sensor Averaging 

Specification: Let $S$ be a set of the values of $n$ abstract sensors of
the same physical state variable, read at the same point in their
domain (e.g. at the same time). Given a maximum number of
faulty sensors $f$, find the smallest interval $\bigcap_{f,n}(S)$ that contains
the correct physical value.

Implementation: Let $l$ be the smallest value contained in at least
$n - f$ of the intervals in $S$ and $h$ be the largest value contained
in at least $n - f$ of the intervals in $S$. Let $\bigcap_{f,n}(S)$ be the interval
spanning $l \le \bigcap_{f,n}(S) \le h$ if $l$ and $h$ exist; otherwise, let $\bigcap_{f,n}(S)$
be $\emptyset$.

Algorithm 1 is inexpensive: it can be easily implemented in $O(n \log n)$
time, which is a lower bound. One implementation is given in [MO83].
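
One way to reach the $O(n \log n)$ bound is to sort the interval endpoints and sweep across them while counting how many intervals are currently open. The Python sketch below follows that idea; it is an illustration written for this text, not the implementation given in [MO83], and the name fault_tolerant_interval and the tuple representation of intervals are assumptions.

```python
from typing import List, Optional, Tuple

Interval = Tuple[float, float]  # closed interval (lo, hi); assumed representation


def fault_tolerant_interval(sensors: List[Interval], f: int) -> Optional[Interval]:
    """Return (l, h), the smallest and largest points contained in at least
    n - f of the given intervals, or None if no such point exists."""
    n = len(sensors)
    need = n - f
    if need <= 0:
        raise ValueError("f must be smaller than the number of sensors")

    # Tag 0 marks a lower endpoint and tag 1 an upper endpoint. Sorting the
    # pairs places lower endpoints first at equal coordinates, so intervals
    # that merely touch still count as intersecting (they are closed).
    events = []
    for lo, hi in sensors:
        if lo > hi:
            raise ValueError("interval endpoints out of order")
        events.append((lo, 0))
        events.append((hi, 1))
    events.sort()

    count = 0  # number of intervals containing the current sweep point
    low = high = None
    for x, kind in events:
        if kind == 0:
            count += 1
            if count >= need and low is None:
                low = x   # first point covered by n - f intervals
        else:
            if count >= need:
                high = x  # latest point still covered by n - f intervals
            count -= 1
    if low is None:
        return None
    return (low, high)
```

For instance, with three sensors (1, 4), (3, 8) and (7, 10) and f = 1, the sweep returns (3, 8); with (1, 2) and (3, 4) and f = 0, it returns None because the intervals do not intersect.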

This definition of $\bigcap_{f,n}(S)$ can contain values that we know cannot be the
correct value. For example, figure 3 shows the intersection of three intervals
a, b and c. If $f = 1$, the correct value must be within $\gamma_1$ or $\gamma_2$. By
algorithm 1, however, we define the correct sensor value to be $\gamma$. We do
this to preserve the “shape” of the sensor as seen by the control program.
Our program is written for an abstract sensor, which is a single interval,
and the interval defined by algorithm 1 is the smallest single interval that
contains the correct value.

Figure 3: Intersection with $n = 3$ and $f = 1$

The width of a sensor’s value determines its inaccuracy. The following
theorem gives an upper bound on the number of faulty sensors one can have
and still have a bound on the inaccuracy of the result. Define the operations
$\min_i$ and $\max_i$ to be the $i$th smallest and $i$th largest values of their operand
(a set of values), respectively. Note that $\min_i = \max_{n-i+1}$. For example, if
$S = \{13, 14, 15\}$ then $\min_3(S) = \max_1(S) = 15$.

Theorem 1 If $f \le \lfloor (n-1)/2 \rfloor$ and $\bigcap_{f,n}(S) \ne \emptyset$, then

$$ |\bigcap_{f,n}(S)| \le \min_{2f+1} \{ |s| : s \in S \} $$

If $f > \lfloor (n-1)/2 \rfloor$, the derived interval can be more inaccurate than any sensor
in the system. An example is shown in figure 4 (footnote 4).

Theorem 1 bounds the accuracy of the derived sensor in terms of the accu- 
racy of the values read. However, if this bounding sensor is faulty, one might 
worry about the bound being meaningful. Consider the sensors shown in 
figure 5. If abstract sensor c is faulty and we allow faulty sensors to be 
arbitrarily inaccurate, the derived sensor can be more inaccurate than any 
correct sensor. 

4 Unless stated otherwise, all the proofs of the theorems given in this section are
presented in section 3.4.

Figure 4: Intersection with $n = 3$ and $f = 2 > \lfloor (n-1)/2 \rfloor$


In many applications, there exists a limit on the inaccuracy of a sensor. For 
example, if our abstract temperature sensors are all polled at roughly the 
same time, the accuracy of the abstract sensors will not differ significantly 
from each other. Thus, the bound of theorem 1 is reasonable. If sensors 
can have widely differing accuracy, however, fewer failures can be tolerated. 
Theorem 2 gives this bound. 

Theorem 2 Let $C$ be the (unknown) subset of $S$ that are correct. If
$f \le \lfloor (n-1)/3 \rfloor$ and $\bigcap_{f,n}(S) \ne \emptyset$, then

$$ |\bigcap_{f,n}(S)| \le \min_{f+1} \{ |s| : s \in C \} $$

Under the conditions of theorem 2, a minimum of four sensors is necessary
to tolerate a single faulty sensor. Figure 6 is an example of this case.


3.2 Probabilistic Interpretation 

The previous section gives upper bounds on the inaccuracy of the correct
abstract sensor calculated by algorithm 1. However, the actual accuracy
can be much better. If an abstract sensor $\bar{s}_i(t)$ is correct, we can define the
probability distribution function of the physical value in the range $\bar{s}_i(t)$.
Let the probability that $s(t) = s' \in \bar{s}_i(t)$ be $f_i(s')\,ds'$. The expected value of
the accuracy of the fault-tolerant sensor, $E(|\bigcap_{f,n}(S)|)$, has the following
property.

Figure 5: Intersection with sensor c faulty

Theorem 3 Assume that the sensors are identically distributed and
independent of each other. If

$$ f_i(\min \bar{s}_i(t)) > 0 \text{ and } f_i(\max \bar{s}_i(t)) > 0 $$

then

$$ \lim_{n \to \infty} E(|\bigcap_{f,n}(S)|) = 0 \text{ for any fixed } f $$

The rate of convergence to the correct physical value depends on the
actual distribution function. For example, let $f_i$ be the uniform distribution
function

$$ f_i(s') = \begin{cases} 1/|\bar{s}_i(t)| & s' \in \bar{s}_i(t) \\ 0 & \text{otherwise} \end{cases} $$

In this case, the expected value of $|\bigcap_{f,n}(S)|$ approaches zero at a rate of
$O(1/n)$; here, even a modest amount of replication can yield a very accurate
abstract sensor. A proof of this rate of convergence, along with a proof of
theorem 3, can be found in [Mar84].
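
As a quick sanity check of the $O(1/n)$ rate, consider the simplest special case. The setting is an assumption made here for illustration and is narrower than the one treated in [Mar84]: no faulty sensors ($f = 0$), all $n$ intervals of the same width $w$, and the position of the physical value $T$ within each interval independent and uniform. The fault-tolerant interval is then the plain intersection, and its expected width follows from the expected range of $n$ uniform samples:

```latex
\begin{align*}
\bar{s}_i &= [\,T - u_i w \;..\; T + (1 - u_i) w\,],
  \qquad u_i \sim \mathrm{Uniform}(0,1) \text{ independently} \\
\bigcap\nolimits_{0,n}(S) &= [\,T - w \min_i u_i \;..\; T + w\,(1 - \max_i u_i)\,] \\
\Bigl|\bigcap\nolimits_{0,n}(S)\Bigr| &= w \bigl(1 - (\max_i u_i - \min_i u_i)\bigr) \\
E\Bigl(\Bigl|\bigcap\nolimits_{0,n}(S)\Bigr|\Bigr)
  &= w \Bigl(1 - \frac{n-1}{n+1}\Bigr) = \frac{2w}{n+1} = O(1/n)
\end{align*}
```

The last step uses $E(\max_i u_i) = n/(n+1)$ and $E(\min_i u_i) = 1/(n+1)$. Under these assumptions, nine sensors already give an expected width of one fifth of a single sensor's width.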
Figure 6: Intersection with $n = 4$, $f = 1$ and sensor d faulty

As an aside, the probabilistic interpretation of this section might suggest a
fuzzy logic approach to this problem [Zad75]. The interpretation is
straightforward: define $A_{f,n}(S, s)$ as the disjunction of the $\binom{n}{n-f}$ terms formed by
the conjunction of $n - f$ fuzzy boolean values $b_i$ whose membership is taken
from the uniform probability distribution function $f_i(s)$. $A_{f,n}$ is equivalent
to the intersection of the $(n - f)$-cliques (footnote 5) in $S$. If we changed
algorithm 1 to return a set of intervals rather than a single interval, $A_{f,n}$ and
$\bigcap_{f,n}$ would be equivalent.

3.3 Other Failure Models 

If a sensor is known a priori to be faulty, the bounds given in theorems 1
and 2 can be improved. Faulty sensors need not be included in $S$ when
calculating $\bigcap_{f,n}(S)$, and $n$ and $f$ are reduced accordingly. In particular, if
we can identify $f$ faulty sensors, we can calculate a correct abstract sensor
that is at least as accurate as the most accurate sensor. More formally, let
$W$ be the subset of $S$ that are known to be faulty, and let $|W| = w \le f$.
Then, $\bigcap_{f-w,\,n-w}(S - W)$ is a correct interval, and theorems 1 and 2 hold.

5 A $k$-clique is an intersection of $k$ intervals.

The main problem is determining when a sensor is faulty. Ideally, a
sensor could be self-diagnosing, and return the value $\emptyset$ when faulty.
Following [Sch84], we will call such sensors fail-stop. With fail-stop sensors, we
can tolerate up to $n - 1$ failures and will calculate a sensor that is at least
as accurate as the most accurate correct sensor and at best approaches the
exact physical value.

Of course, fail-stop sensors are useful only if they can be implemented. We
can use algorithm 1 to detect failed sensors under an arbitrary failure model.
This algorithm is very simple: any sensor in $S$ that does not intersect
$\bigcap_{f,n}(S)$ cannot contain the correct value, and is therefore incorrect and has
failed.

Algorithm 2 Detecting failed sensors. 

Specification: Given $n$ sensors $S$ and a maximum number of
faulty sensors $f$, find a subset of the sensors in $S$ that are
incorrect.

Implementation: Let $\bigcap_{f,n}(S)$ be the interval calculated by
algorithm 1. If $W$ is the set of sensors that are known to be
incorrect, add to $W$ the sensors

$$ \{ s : s \in S \land s \cap \bigcap_{f,n}(S) = \emptyset \} $$

It is likely that algorithm 2 will fail to detect some of the incorrect sensors.
For example, using algorithm 2 with the sensors in figure 3 yields $W = \emptyset$,
since we cannot tell which of the two sensors a, c is incorrect.
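
Algorithm 2 can be sketched on top of an endpoint sweep for algorithm 1. The Python below is a minimal illustration, not code from the paper: the function names, the tuple representation of intervals, and the choice to raise an error when the fault-tolerant interval is empty (which means more than f sensors must be faulty) are all assumptions made here.

```python
from typing import Iterable, List, Optional, Set, Tuple

Interval = Tuple[float, float]  # closed interval (lo, hi); assumed representation


def fault_tolerant_interval(sensors: List[Interval], f: int) -> Optional[Interval]:
    """Smallest and largest points lying in at least n - f intervals, or None."""
    need = len(sensors) - f
    if need <= 0:
        raise ValueError("f must be smaller than the number of sensors")
    # Lower endpoints (tag 0) sort before upper endpoints (tag 1) at equal values.
    events = sorted([(lo, 0) for lo, _ in sensors] + [(hi, 1) for _, hi in sensors])
    count, low, high = 0, None, None
    for x, kind in events:
        if kind == 0:
            count += 1
            if count >= need and low is None:
                low = x
        else:
            if count >= need:
                high = x
            count -= 1
    return None if low is None else (low, high)


def detect_failed(sensors: List[Interval], f: int,
                  known_failed: Iterable[int] = ()) -> Set[int]:
    """Indices of sensors known to be incorrect: those already known, plus
    every sensor that does not intersect the fault-tolerant interval."""
    result = fault_tolerant_interval(sensors, f)
    if result is None:
        raise ValueError("no point lies in n - f intervals; more than f sensors are faulty")
    lo, hi = result
    # Two closed intervals are disjoint exactly when one ends before the other starts.
    newly_failed = {i for i, (a, b) in enumerate(sensors) if b < lo or a > hi}
    return set(known_failed) | newly_failed
```

With three sensors arranged like those of figure 3, for example (1, 4), (3, 8) and (7, 10) with f = 1, every sensor intersects the derived interval and nothing is detected; adding a clearly outlying fourth reading such as (14, 15) to the sensors (8, 12), (11, 13), (10, 12) flags only that fourth sensor.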

So far, we have assumed that once a sensor fails it remains failed, so once 
a sensor is added to W it will remain in W. This assumption may not 
be realistic for sensors, since an abstract sensor may maintain no state. It 
seems natural to assume a sensor may occasionally fail in an apparently 
malicious way, and then “heal” itself and subsequently yield correct values. 
So, a natural extension to the arbitrary failure model is to assume that
at all times $t$ there exists a set of faulty sensors $\mathcal{F}(t)$ such that $\mathcal{F}(t)$ may
differ from $\mathcal{F}(t')$ when $t \ne t'$, and $\forall t : |\mathcal{F}(t)| \le f$. Unfortunately, we cannot
construct a correct abstract sensor under these conditions. We must also
guarantee that there exists a period $\Delta$ such that the number of failures in
all intervals of time no longer than $\Delta$ is bounded:

$$ \exists \Delta > 0 : \forall t : |\{ s \in S : \exists \delta : 0 \le \delta \le \Delta : s \in \mathcal{F}(t + \delta) \}| \le f $$

If algorithm 1 obtains fresh values from each physical sensor used in
calculating $S$ and algorithm 1 runs no longer than $\Delta$ seconds, it can still be used
to construct a correct sensor. As $\Delta \to \infty$ this model becomes the earlier
arbitrary failure model.


3.4 Proofs 

To prove theorems 1 and 2, we will need a few lemmas. 


Lemma 1 Let $S$ be a set of intervals containing at least one $c$-clique.
Furthermore, suppose that all $c$-cliques in $S$ have exactly $i$ intervals in common
with each other. Then,

$$ |S| \ge \begin{cases} i & i \ge c \\ 2c - i & i < c \end{cases} $$

Let $S$ be a smallest set of intervals such that all $c$-cliques in $S$ have exactly
$i$ intervals in common, and let $n = |S|$. By definition, this set must contain an
$i$-clique, so $n \ge i$. If $i \ge c$, the smallest set consists of only the $i$-clique,
so $n = i$. If $i < c$, the set must contain more than one $c$-clique, for otherwise
the single $c$-clique has $c$ intervals in common with itself. Since $S$ has the smallest
possible number of intervals, it must contain two $c$-cliques. These two
cliques have $i$ intervals in common, so each has $c - i$ intervals not in common
with the other. The minimum number of intervals meeting this condition
is $n = 2(c - i) + i = 2c - i$. □


Lemma 2 If $n \ge 2c$, there exists a set of $n$ intervals $S$ such that all the
$c$-cliques in $S$ share no intervals.

Let $S$ contain two distinct $c$-cliques and $n - 2c$ distinct 1-cliques. This set
satisfies the lemma. □

Lemma 3 Given a set of $n$ intervals $S$ containing at least one $c$-clique ($n <
2c$), the smallest number of intervals $i$ in common with all $c$-cliques in $S$ is
$2c - n$.

Suppose $S$ has no more than $i' < i$ intervals in common with all $c$-cliques.
By lemma 1, the smallest such set contains $2c - i' > n$ intervals, a
contradiction. □

Lemma 4 Let $s \in S$ be any member of all maximal cliques of $S$. The closure
of the intersection of the maximal cliques is no larger than $s$.

The intersection of any maximal clique cannot contain any point outside of
$s$, since by definition that point is not in an intersection containing $s$ and
$s$ is a member of each clique. The closure only adds points between the
intersections. Since $S$ is a set of intervals over the reals, $s$ must contain all
points between the maximal cliques, so the closure does not add any points
outside $s$. Since all the points in the closure are also in $s$, the closure cannot be
larger than $s$. □

Theorem 1 can now be shown. By algorithm 1 the maximal clique in $S$
must be at least as large as $n - f$, for otherwise $\bigcap_{f,n}(S) = \emptyset$. By lemma 3
at least $n - 2f$ intervals intersect all cliques. By lemma 4 the closure of
the intersection cannot be larger than any of these $n - 2f$ intervals. The
closure, however, may be larger than any of the remaining $2f$ intervals. In
the worst case, these remaining intervals are the smallest ones in $S$, and
the theorem follows. Additionally, by lemma 2 we know that the bound on
$f$ is an upper bound. □

Theorem 2 can now be shown. By theorem 1,

$$ |\bigcap_{f,n}(S)| \le \max_{n-2f} \{ |s| : s \in S \} $$

For $|\bigcap_{f,n}(S)|$ to be bounded by a correct sensor, $n - 2f > f$ or $n > 3f$.
This corresponds to the faulty sensors being the most inaccurate, so

$$ |\bigcap_{f,n}(S)| \le \min_{f+1} \{ |s| : s \in C \} $$

□

4 Example 

The methodology presented in this paper is not an automatic one. It may 
take some work to use abstract sensors, in that the original specification 
may have to be changed to accommodate abstract sensors, and it may be 
difficult constructing a set of independent abstract sensors. In this section, 
we show how a specification can be converted from using physical state 
variables to using abstract sensors, and how an abstract sensor can be 
implemented from a physical sensor. 

As part of the Cornell Real Time Reliable Distributed Systems project, we
are developing a correct process control program from its specification. One
of the problems we have chosen is that of a train traversing a sequence of
$n$ track segments. Associated with each track segment $i$ is a track circuit
$\sigma_i$ that, when nonfaulty, is true if and only if the train occupies that track
segment. Assume that segment $i$ spans the positions $c_i$ through $c_{i+1}$ where
$(\forall i : 0 \le i < n : c_i < c_{i+1})$. The train has position $x$ and velocity $v$, has
zero length (footnote 6), starts at position $c_0$ and moves in the direction of
increasing $x$ (towards $c_n$). Each track segment has a minimum and maximum speed
$min_i$ and $max_i$; if the train exceeds these limits, it will derail. Additionally,
there is a random communications delay associated with all messages in the
system that is bounded by $\delta$ seconds.

Our safety condition is

$$ S \stackrel{\mathrm{def}}{=} c_i \le x \le c_{i+1} \Rightarrow min_i \le v \le max_i $$

This condition is expressed in terms of physical variables, so we need to
change $S$. The obvious condition is

$$ S' \stackrel{\mathrm{def}}{=} (\exists x \in \bar{x} : c_i \le x \le c_{i+1}) \Rightarrow (\forall v \in \bar{v} : min_i \le v \le max_i) $$

since this also excludes the unsafe states (at a penalty of running the train
conservatively).

6 This is not an unreasonable assumption; given a train of length $L$ and a track system
$K$, one can construct another track system $K'$ on which a train of zero length is constrained
in exactly the same way as the $L$-length train is on $K$.

We will show how an abstract sensor of position, $\bar{x}_i$, can be constructed from
the track circuit $\sigma_i$. The simplest way to do so is to assume a bound on the
velocity of the train, $v \le v_{max}$. Define the global array:

    var train[i]: {before, in, after} := before;

Define a polling process for each track circuit $\sigma_i$. Note the delay is
represented by a delay statement; the implementation must ensure that no more
than $\Delta$ seconds elapse between successive polls of a sensor. The value of
$\Delta$ must be small enough that the polling process does not “miss” the train
traversing the track segment it is monitoring:
$\Delta < (c_{i+1} - c_i - \delta v_{max})/v_{max}$.
The assertion $I$ is a loop invariant, and $t$ is the current time.

process Poll[i] =
begin
    {I : (train[i] = before ⇒ 0 ≤ x(t) ≤ c[i] + δ·v_max) ∧
         (train[i] = in ⇒ c[i] ≤ x(t) ≤ c[i+1] + δ·v_max) ∧
         (train[i] = after ⇒ c[i+1] ≤ x(t) ≤ c[n])}
    do true →
        delay Δ;
        if σ[i] ∧ (train[i] = before) → train[i] := in
        □ ¬σ[i] ∧ (train[i] = in) → train[i] := after
        □ ¬σ[i] ∧ (train[i] = before) → skip
        □ σ[i] ∧ (train[i] = in) → skip
        □ (train[i] = after) → skip
        fi
    od;
end

The abstract sensor, defined functionally, comes from the loop invariant $I$
and the distance the train could have moved since the last time $\sigma_i$ was read:

    $\bar{x}_i$ = if train[i] = before → [0 .. c[i] + (δ + Δ)·v_max]
            □ train[i] = in → [c[i] .. c[i+1] + (δ + Δ)·v_max]
            □ train[i] = after → [c[i+1] .. c[n]]
            fi

This is a simplistic abstract sensor for $x$. One can define a more accurate
$\bar{x}_i$ by noting the time a track circuit first comes on. The implementation
of this sensor is more complex, but has a structure similar to the one given
here.
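
The functional definition of the position sensor translates directly into code. The Python sketch below is illustrative only: the function and parameter names are hypothetical, the polling state is passed in as a string, and the segment boundaries are held in a list c with c[0] through c[n]. The helper for the polling period simply evaluates the bound stated above for the polling process.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # closed interval (lo, hi); assumed representation


def max_poll_period(c: List[float], i: int, delta: float, v_max: float) -> float:
    """Strict upper bound on the polling period for segment i, so that the
    train cannot cross the whole segment between two successive polls."""
    return (c[i + 1] - c[i] - delta * v_max) / v_max


def position_sensor(state: str, i: int, c: List[float],
                    delta: float, poll: float, v_max: float) -> Interval:
    """Abstract position sensor derived from the polling state of segment i.

    state is one of "before", "in", "after"; delta bounds the message delay,
    poll is the polling period and v_max bounds the train's speed.
    """
    # Distance the train may have covered since the circuit was last read.
    slack = (delta + poll) * v_max
    if state == "before":
        return (0.0, c[i] + slack)
    if state == "in":
        return (c[i], c[i + 1] + slack)
    if state == "after":
        return (c[i + 1], c[-1])
    raise ValueError("unknown state: " + state)
```

Each track segment yields one such interval, so the intervals from several segments can be treated as replicated abstract sensors of the same position.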


5 Discussion 

This paper presents a four-step process, through which a program written 
in terms of physical state variables is transformed to one that reads the 
physical state variable through a set of physical sensors, some of which 
may be faulty. The degree of sensor replication depends on the failure 
model we assume. Figure 7 summarizes the maximum number of faulty 
sensors that can be tolerated for the three failure models considered in this 
paper. 


Failure Model                                   f_max                          min n : f = 1

arbitrary failures with unbounded inaccuracy    $\lfloor (n-1)/3 \rfloor$      4
arbitrary failures with bounded inaccuracy      $\lfloor (n-1)/2 \rfloor$      3
fail-stop failures                              $n - 1$                        2

Figure 7: Maximum failures for different error models

The methodology presented here is incomplete, however. For example, 
there are other reasonable failure models that we are investigating. While 
some of these will undoubtedly reduce to the models presented here, others 
may not. We have also only considered sensors that read a single physical 
value from a real domain. There are other kinds of sensors; for example, 
a sensor denoting whether or not a door is open, or a sensor that returns
the altazimuth coordinates of an airplane. We are currently extending the 
material in this paper to accommodate these more general sensors. 